Improved Automatic Keyword Extraction Based on TextRank Using Domain Knowledge
Abstract
Keyword extraction from scientific articles helps in retrieving articles on a given topic and in tracking trends in academic development. For keyword extraction from Chinese scientific articles, we adopt a framework that selects keyword candidates by Document Frequency Accessor Variety (DF-AV) and runs the TextRank algorithm on a phrase network. To improve the domain adaptation of keyword extraction, we introduce known keywords of a given domain into this framework as domain knowledge. Experimental results show that domain knowledge generally improves keyword extraction performance.
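As a rough illustration of the ranking step, the sketch below runs a TextRank-style iteration over a small phrase co-occurrence graph. The abstract does not say how domain knowledge enters the framework, so the restart bias toward known domain keywords used here is only one plausible reading, not the authors' method. DF-AV candidate selection is not shown: the candidate phrases are assumed to be given. The names build_graph, textrank, domain_keywords, boost and window are all hypothetical.

```python
from collections import defaultdict


def build_graph(phrases, window=2):
    """Undirected co-occurrence graph: link phrases at most `window` positions apart."""
    graph = defaultdict(set)
    for i, p in enumerate(phrases):
        for q in phrases[i + 1 : i + window + 1]:
            if p != q:
                graph[p].add(q)
                graph[q].add(p)
    return graph


def textrank(graph, domain_keywords=frozenset(), boost=2.0, damping=0.85, iterations=50):
    """TextRank-style scores; known domain keywords get a larger restart weight.

    The restart bias is an assumption made for illustration only.
    """
    nodes = list(graph)
    weight = {n: boost if n in domain_keywords else 1.0 for n in nodes}
    total = sum(weight.values())
    restart = {n: w / total for n, w in weight.items()}
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Each node receives an equal share of every neighbour's current score.
        score = {
            n: (1 - damping) * restart[n]
            + damping * sum(score[m] / len(graph[m]) for m in graph[n])
            for n in nodes
        }
    return score


# Candidate phrases in document order (assumed to come from a prior selection step).
phrases = ["keyword extraction", "textrank", "phrase network", "textrank", "domain knowledge"]
graph = build_graph(phrases)
plain = textrank(graph)
boosted = textrank(graph, domain_keywords={"domain knowledge"})
for phrase in sorted(boosted, key=boosted.get, reverse=True):
    print(phrase, round(boosted[phrase], 3))
```

With boost=1.0 or an empty domain_keywords set, the function reduces to unweighted TextRank on the same graph, which makes it easy to compare rankings with and without the domain bias.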
Similar resources
TextRank: Bringing Order Into Texts
In this paper, we introduce TextRank – a graph-based ranking model for text processing, and show how this model can be successfully used in natural language applications. In particular, we propose two innovative unsupervised methods for keyword and sentence extraction, and show that the results obtained compare favorably with previously published results on established benchmarks.
DegExt - A Language-Independent Graph-Based Keyphrase Extractor
In this paper, we introduce DegExt, a graph-based language-independent keyphrase extractor, which extends the keyword extraction method described in [6]. We compare DegExt with two state-of-the-art approaches to keyphrase extraction: GenEx [11] and TextRank [8]. Our experiments on a collection of benchmark summaries show that DegExt outperforms TextRank and GenEx in terms of precision and area un...
Automatic Generation of Personalized Annotation Tags for Twitter Users
This paper introduces a system designed to automatically generate personalized annotation tags that label Twitter users’ interests and concerns. We applied TFIDF ranking and TextRank to extract keywords from Twitter messages to tag the user. The user tagging precision we obtained is comparable to the precision of keyword extraction from web pages for content-targeted advertising.
Automatic Summarization for Terminology Recommendation: The Case of the NCBO Ontology Recommender
The National Center for Biomedical Ontology (NCBO) ontology recommender helps users choose a biomedical terminology by analyzing a submitted document. Submitting a single document might not be representative and result in poor recommendations, while submitting a large sample might be expensive, sometimes unfeasible. In this paper, we investigate the effectiveness of two well-researched automati...
Automatic Keyword Extraction Using Domain Knowledge
Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By including a hierarchically organised domain-specific thesaurus as a second knowledge source, the quality of such keywords was improved considerably, as measured by match to previously manually assigned keywords.